Language/OS - Multiplatform Resource Library

home *** CD-ROM | disk | FTP | other *** search

/ Language/OS - Multiplatform Resource Library / LANGUAGE OS.iso / dsp / fft / fft_eyal.lha / fft_eyal / readme < prev

Wrap

Text File | 1991-09-01 | 6.3 KB | 141 lines

This is the second release of these programs. The first release (packaged by Udi Finkelstein) included only the m68k programs. This is my implementation of integer real fft computation. It was designed for high speed analysis of audio input in real time on a PC/AT. It has support for generic C and optimized assembly for: - intel 80x86 - motorola 68k - natsemi 32k The assembler is usualy 4-8 times faster than the generic C so it is worth using. The data used was 8bit but the routine does 16bit signed arithmetic. The algorithm of choice was one which minimizes mults while keeping the number of needed registers to a minimum (the Intel chip does not have enough). Data if gradualy scaled in flight to avoid overflow. A standard (2,4) split-radix was used as described in [Sorensen et al., IEEE ASSP-35 no 6., June 87, pp. 849-863]. Other algoriths (WFT etc.) where found to use too complex arithmetic for efficient implementation on a machine with a decent mult instruction because register spills slow it more than the reduce mults speed it. WHAT IT DOES NOT DO: - NO inverse fft. - NO complex input. - NO windowing. - output is power spectrum WITHOUT the square root taken. - the sample data array (x[]) is not descramled so you cannot use it directly. see how the power spectrum is derived if you want to generate phase or whatever. fft7() and fft8() function do this post-processing. The method of implementaion was to have a program (fftg.c) that generates the fft routine which is then compiled/assembled and used. The generated program is assembly, but for portability this release generates a C program too. I have routines for Intel assembly and ns32k assembly, other machines can have the basic routines re-written without much effort (but for best performance one would optimize these routines extensively). The archive contains a sample program (fft2.c) that uses the fft routine. It draws some basic picture on a screen using move() and draw() function which you will have to supply (it uses the microsoft c graphics library in hercules mode using msherc.com). To run the test program: 'fft2a' or 'fft2c' or 'fft2f'. you can try 'fft2a x' which will call fft 10000 times and display the time, then do 'fft2a x x' to measure the overhead of the loop, then a result of (e.g.) 29 from the first and 3 from the second means an average time of 2.6ms/fft. When the program asks for a data file respond 'd1' of 'd2' etc. or 'end' to stop. Originaly the program was a BASIC program from some British mag a few years ago which started my interest; it took a few minutes to do one fft. The current version does a 256 point fft on a 25MHz ns32532 in 2.6ms(asm)/4.2ms(c)/5.8ms(funcs), on a 10MHz 80286 it takes 7ms(asm, it has a rather fast int mult)/38.8ms(c)/44.4ms(funcs). As you see, gcc on the ns32k compares better to hadcrafted assembly than msc5.1. I now use a 20MHz 80286. If you sample at 44Khz, 256 points take 5.8ms to sample but only 2.9ms to do the fft (assuming the sampling is done via DMA, inerrupts would eat too much cpu time), so you have 2.9ms for other stuff (display? this is slow unless you really try. I wrote my low level display routines or the fft gain would be lost). On a 386/486 you are laughing. The package contains one ready fft() in the file fft8c.f; it does a 256 point fft in the slowest way but it should compile readily. A program to measure the time is also supplied as fft1.c. You will need to link the following: fft1.o fft8f.o fftsubs.o then run the result to see how fast it goes. Be patient, the test takes about one minute. On some machine (where 'mult' has variable time) this test will give only the typical speed. To get the generator programs 'make progs'. A generator program is built from two modules: fft.g (the core) and fftout?.c (the output driver). You should choose one of the drivers or write your own if there is no support for your machine (in which case I would like to hear from you). The driver fftoutf.c generates a C program that calls functions in fftsubs.c; this should work on every system. Once you have the generator, you need to run it and generate the fft program: fftgf fft8f 8 fft this will build a 256 (2^8) points fft() function into fft8f.c. Now compile and then link into your application. All of the above is done in the makefile, look at it. To get the fft programs 'make fft'. The 'c' version uses fftsubs.h. The 'f' version uses C function calls which is a bit slower but uses far less memory; it uses fftsubs.c. The '86' version uses fftsubs.mac, Note that the makefile generates a 256 point routine with entry point fft(), you may want to change it for your needs or just edit the output. fftouts.c generates ns32k assembly, fftouta.c generates Intel assembly. These should give a good idea of how to port the assembly level routines to another machine. To get the test programs 'make tests' (but first fix fft2.c for the graphics if needed). MSDOS NOTE: The c output has to be split into smaller files because a) the compiler wont optimize a large prog b) the compiler dies on any larger progs, and c) the code segment would be too large. The output routines in fftoutc.c will break the output every 1000 statements. This is fine for msc5.1, other compilers may cope with larger programs, so just up the limit in break_fft(). HOW TO MODIFY FFT: The program to modify is fftg.c and the related fftout?.c that you use. You would probably modify the latter. In general, the calls to fft1...fft5 handle the fft proper while fft7+fft8 do the final power spectrum and reordering. The last two is where you extract the results you wanted from x[] into qf[]. x[] holds the real and complex parts in the order: xx[0-255]= R[0],...,R[128],I[127],...,I[1] but in this program xx[i] is in position x[mm[i]]. HOW TO CALL FFT: In your program header include: short near x[256] = {0}; /* some compilers need to initialize */ short near qf[128+1] = {0}; extern far fft (); /* or whatever you called it */ Then in the program: /* read the data points into x[0-255]*/ fft (); /* the power spectrum is in qf[0-128], qf[0] is the DC componnent */ Thats all. Please let me know of problems/difficulties you encounter with these programs as well as any suggestions/improvements. Regards Eyal Lebedinsky eyal@ise.canberra.edu.au